{ "cells": [ { "cell_type": "markdown", "id": "1eb7016d-c75e-44f9-929d-8e60bc0b3840", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "# Loading Spectrum Data" ] }, { "cell_type": "raw", "id": "95faed62-f403-4863-970c-b0ab14d8b471", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ ".. currentmodule:: massdash" ] }, { "cell_type": "raw", "id": "42dfd8ee-cbc2-4b0d-8867-394be19db190", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Spectrum data loaders implement the same methods as chromatogram data loaders, plus some additional methods, because spectrum data carries more information. Fetching raw data with a spectrum loader takes longer because the data is extracted on the fly. In addition, a :py:class:`~structs.TargetedDIAConfig` must be specified to define how the peptide is extracted. " ] }, { "cell_type": "code", "execution_count": 1, "id": "2f7b4ad2-0486-4f1b-ba89-11e7dd381b0e", "metadata": { "editable": true, "nbsphinx": "hidden", "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "id": "f15a7806-3adb-41be-8d81-5e1eda910d71", "metadata": { "editable": true, "nbsphinx": "hidden", "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "# Please run this before executing any cell\n", "import os\n", "os.chdir(\"../../test/test_data/\") #### Insert path to data, this is the path to the tutorial data. 
" ] }, { "cell_type": "markdown", "id": "7e070013-5f19-468e-a620-f04d18fbc421", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## Initiating a Spectrum Data Loader" ] }, { "cell_type": "markdown", "id": "f97a362b-3752-4992-9bb4-230a2b4b7810", "metadata": { "editable": true, "raw_mimetype": "text/markdown", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Most [Spectrum Loaders](../generated/massdash.loaders.GenericSpectrumLoader.rst) require the following inputs. " ] }, { "cell_type": "markdown", "id": "dfd2e13c-4307-4663-bbae-16a885016e73", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "\n", "1. **dataFiles** - a list of raw data files \n", "2. **rsltsFile** - a `.osw` or DIA-NN `.tsv` file containing the features\n", "3. **libraryFile** - a `.tsv`/`.osw`/`.pqp` file containing the library (m/z and annotations of all transitions)" ] }, { "cell_type": "raw", "id": "1094bb7e-5ba5-4378-885e-9f0ae5e659ae", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We can initiate a :py:class:`~loaders.MzMLDataLoader` object as follows. 
" ] }, { "cell_type": "code", "execution_count": 3, "id": "efbbf242-6cb5-4e67-a771-38d8d5357d9a", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initializing valid scores for selection\n", "[2024-09-30 17:29:26,200] MzMLDataAccess - INFO - Opening mzml/ionMobilityTest.mzML file...: Elapsed 0.08319735527038574 ms\n", "[2024-09-30 17:29:26,201] MzMLDataAccess - INFO - There are 50 spectra and 0 chromatograms.\n", "[2024-09-30 17:29:26,202] MzMLDataAccess - INFO - There are 25 MS1 spectra and 25 MS2 spectra.\n" ] } ], "source": [ "from massdash.loaders import MzMLDataLoader\n", "loader = MzMLDataLoader(dataFiles=\"mzml/ionMobilityTest.mzML\",\n", " rsltsFile=[\"osw/ionMobilityTest.osw\", \"diann/ionMobilityTest-diannReport.tsv\"])" ] }, { "cell_type": "markdown", "id": "dde208d2-b025-427e-8061-4fafcfa7ec5f", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "
| | run | Annotation | rt | int |
|---|---|---|---|---|
| 0 | ionMobilityTest | prec | 6225.005106 | 229.011734 |
| 1 | ionMobilityTest | prec | 6226.792950 | 26.001631 |
| 2 | ionMobilityTest | prec | 6228.580932 | 57.999416 |
| 3 | ionMobilityTest | prec | 6230.367189 | 826.008179 |
| 4 | ionMobilityTest | prec | 6232.156436 | 1589.015259 |
| ... | ... | ... | ... | ... |
| 163 | ionMobilityTest | y9^1 | 6259.292755 | 4355.988281 |
| 164 | ionMobilityTest | y9^1 | 6261.101406 | 1168.029907 |
| 165 | ionMobilityTest | y9^1 | 6262.909095 | 1286.014038 |
| 166 | ionMobilityTest | y9^1 | 6264.711573 | 413.995209 |
| 167 | ionMobilityTest | y9^1 | 6266.515136 | 1217.012207 |

168 rows × 4 columns

| | native_id | ms_level | precursor_mz | mz | rt | im | int | Annotation | product_mz |
|---|---|---|---|---|---|---|---|---|---|
| 0 | | 1 | 642.3295 | 642.334187 | 6225.005106 | 0.900254 | 76.000458 | prec | 642.3295 |
| 1 | | 1 | 642.3295 | 642.334187 | 6225.005106 | 0.969271 | 153.011276 | prec | 642.3295 |
| 2 | | 2 | 642.3295 | 504.262011 | 6225.110817 | 0.935281 | 68.001518 | y4^1 | 504.2664 |
| 3 | | 2 | 642.3295 | 504.262011 | 6225.110817 | 1.025902 | 41.000328 | y4^1 | 504.2664 |
| 4 | | 2 | 642.3295 | 504.262011 | 6225.110817 | 0.926001 | 43.000782 | y4^1 | 504.2664 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6812 | | 2 | 642.3295 | 1065.546118 | 6266.515136 | 0.975441 | 8.999968 | y9^1 | 1065.5463 |
| 6813 | | 2 | 642.3295 | 1065.551224 | 6266.515136 | 0.986777 | 33.001766 | y9^1 | 1065.5463 |
| 6814 | | 2 | 642.3295 | 1065.551224 | 6266.515136 | 0.923945 | 84.003464 | y9^1 | 1065.5463 |
| 6815 | | 2 | 642.3295 | 1065.556331 | 6266.515136 | 0.910546 | 63.997871 | y9^1 | 1065.5463 |
| 6816 | | 2 | 642.3295 | 1065.556331 | 6266.515136 | 0.921891 | 54.000694 | y9^1 | 1065.5463 |

6817 rows × 9 columns
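
The two tables above appear to be related: the first holds one intensity per annotation and retention time, while the second holds the individual peaks, including their ion mobility (`im`). The sketch below uses plain pandas, not a MassDash method, and is only an illustration under the assumption that the chromatogram-style table is the spectrum-level table summed over `mz` and `im`. It rebuilds the first rows of the second table by hand (values copied from the rows shown) and sums `int` per `Annotation` and `rt`. The names `spectrum_df` and `chrom_df` are hypothetical.

```python
import pandas as pd

# Rows 0-4 of the spectrum-level table shown above, copied by hand.
# Only the columns needed for the aggregation are kept.
spectrum_df = pd.DataFrame(
    {
        "ms_level": [1, 1, 2, 2, 2],
        "rt": [6225.005106, 6225.005106, 6225.110817, 6225.110817, 6225.110817],
        "im": [0.900254, 0.969271, 0.935281, 1.025902, 0.926001],
        "int": [76.000458, 153.011276, 68.001518, 41.000328, 43.000782],
        "Annotation": ["prec", "prec", "y4^1", "y4^1", "y4^1"],
    }
)

# Assumption: collapsing the ion mobility (and m/z) dimension by summing
# intensities gives one value per annotation and retention time.
chrom_df = spectrum_df.groupby(["Annotation", "rt"], as_index=False)["int"].sum()

print(chrom_df)
```

For the `prec` rows, the summed intensity at `rt` 6225.005106 matches the first row of the first table (229.011734), which supports the assumption. The `y4^1` total here is only partial, because the displayed table is truncated and further peaks at that retention time may exist.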